Detecting Hidden Multiwords in Bilingual Dictionaries
Abstract
Dictionaries are a valuable source of information about multiwords. Unfortunately, only a few multiwords are explicitly marked as such in dictionaries; most are presented without being distinguished from free combinations of words. In this paper we present a methodology for detecting hidden multiwords in bilingual dictionaries, along with their translations into another language. The methodology relies on a number of automatic procedures that exploit regularities in the different kinds of expressions found in the Collins English-Italian bilingual dictionary to select the phrases most likely to contain multiwords. Quantitative results of the experiment are provided.
Similar resources
Extraction of Bilingual Cognates from Wikipedia
In this article, we propose a method to extract translation equivalents with similar spelling from comparable corpora. The method was applied to Wikipedia to extract a large number of Portuguese-Spanish bilingual terminological pairs that were not found in existing dictionaries. The resulting bilingual lexicon consists of more than 27,000 new pairs of lemmas and multiwords, with about 92% accu...
On multiword lexical units and their role in maritime dictionaries
Multi-word lexical units are a typical feature of specialized dictionaries, in particular monolingual and bilingual maritime dictionaries. The paper studies the concept of the multi-word lexical unit and considers the similarities and differences of their selection and presentation in monolingual and bilingual maritime dictionaries. The work analyses such issues as the classification of multi-w...
Constrained Hidden Markov Model for Bilingual Keyword Pairs Alignment
Bilingual terminology dictionaries are resources of much practical importance in many applications of bilingual NLP. Because technical terminology can be both very specific and rapidly evolving, it can, however, be difficult to obtain dictionaries with good coverage. Automatically mining such terminology from technical documents is therefore an attractive possibility. With this goal in mind, and f...
An Investigation into Bilingual Dictionary Use: Do the Frequency of Use and Type of Dictionary Make a Difference in L2 Writing Performance?
Bilingual dictionary use in L2 writing test performance has recently been the subject of debate. Opinions differ according to how the trait is understood and whether the system favors the process-oriented or product-oriented views towards the assessment and writing skill. Given the need for more empirical support, this study is aimed at investigating the availability of bilingual dictionary use...
Acquisition of Bilingual MT Lexicons from OCRed Dictionaries
This paper describes an approach to analyzing the lexical structure of OCRed bilingual dictionaries to construct resources suited for machine translation of low-density languages, where online resources are limited. A rule-based, an HMM-based, and a post-processed HMM-based method are used for rapid construction of MT lexicons based on systematic structural clues provided in the original dictio...
Publication date: 2004